1: Geospatial Analytics for Public Good

Author

Magdalene Chan

Published

November 25, 2023

Modified

December 3, 2023

Objectives

As city-wide urban infrastructures become increasingly digital, datasets from technologies like GPS and RFID on vehicles offer opportunities to track movement patterns over space and time. For instance, smart cards and GPS devices on public buses collect routes and ridership data, containing valuable structure and patterns for understanding human movement and behavior within cities. Despite their potential, the practical use of these extensive location-aware datasets often remains limited to basic tracking and mapping within GIS applications due to the lack of comprehensive spatial and spatio-temporal analysis functions in conventional GIS tools.

Exploratory Spatial Data Analysis (ESDA) holds tremendous potential to address such complex problems. In this study, appropriate Local Indicators of Spatial Association (GLISA) and Emerging Hot Spot Analysis (EHSA) will be applied to undercover the spatial and spatio-temporal mobility patterns of public bus passengers in Singapore.

Tasks

The following tasks will be undertaken in this exercise:

Geovisualisation and Analysis

  1. Compute the passenger trips generated by origin at a hexagon level of 250m for the following time periods:
    1. Weekday morning peak – 6am to 9am (inclusive)
    2. Weekday evening peak – 5pm to 8pm (inclusive)
    3. Weekend/holiday morning peak – 11am to 2pm (inclusive)
    4. Weekend/holiday evening peak – 4pm to 7pm (inclusive)
  2. Display the geographical distribution of the passenger trips using appropriate geovisualisation methods.
  3. Describe the spatial patterns revealed by the geovisualisation (not more than 200 words per visual).

Emerging Hot Spot Analysis

With reference to the passenger trips generated by origin at the hexagon level for the four time periods given above:

  1. Perform Mann-Kendall Test by using the spatio-temporal local Gi* values,
  2. Prepare EHSA maps of the Gi* values of the passenger trips by origin at the hexagon level. The maps should only display the significant (i.e. p-value < 0.05).
  3. With reference to the EHSA maps and data visualisation prepared, describe the spatial patterns revealed (not more than 250 words per cluster).

Getting Started

The code chunk below uses p_load() of pacman package to check if the required packages have been installed on the computer. If they are, the packages will be launched. The following packages will be used:

  • sf package is used for importing, managing, and processing geospatial data.
  • sfdep package is used to create spatial weights matrix and LISA objects using the sf class to represent spatial data.
  • tmap package is used for thematic mapping.
  • plotly package is used to create interactive graphs and charts.
  • tidyverse package is used for aspatial data wrangling.
  • knitr package is used for dynamic report generation in R.
Code
pacman::p_load(sf, sfdep, tmap, plotly, tidyverse, knitr)

Import Data

The data sets used are:

  • Bus Stop Location (Last updated Jul 2023) from LTADataMall retrieved on 18 Nov 2023
  • Passenger Volume by Origin Destination Bus Stops for August 2023 from LTADataMall retrieved on 18 Nov 2023

Import Geospatial Data: Bus Stop Locations

The code chunk below uses the st_read() function of sf package to import BusStop shapefile into R as a simple feature data frame called BusStop. As BusStop uses svy21 projected coordinate system, the crs is set to 3414.

Code
BusStop <- st_read(dsn = "data/geospatial", 
                layer = "BusStop") %>%
  st_transform(crs=3414)
Reading layer `BusStop' from data source 
  `C:\magdalenecjw\ISSS624 Geospatial\Take_Home_Exercise\Ex1\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
Code
# Examine the structure of the data frame
str(BusStop)
Classes 'sf' and 'data.frame':  5161 obs. of  4 variables:
 $ BUS_STOP_N: chr  "22069" "32071" "44331" "96081" ...
 $ BUS_ROOF_N: chr  "B06" "B23" "B01" "B05" ...
 $ LOC_DESC  : chr  "OPP CEVA LOGISTICS" "AFT TRACK 13" "BLK 239" "GRACE INDEPENDENT CH" ...
 $ geometry  :sfc_POINT of length 5161; first list element:  'XY' num  13576 32884
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA
  ..- attr(*, "names")= chr [1:3] "BUS_STOP_N" "BUS_ROOF_N" "LOC_DESC"

There are a total of 5161 features in the BusStop shapefile. Notably, BUS_STOP_N is listed as a character variable. As this variable will be used as the identifier to link to the aspatial data, it should be transformed to a factor so that R treats it as a grouping variable.

Code
# Apply as.factor() to the column
BusStop$BUS_STOP_N <- as.factor(BusStop$BUS_STOP_N)

# Re-examine the structure of the data frame
str(BusStop)
Classes 'sf' and 'data.frame':  5161 obs. of  4 variables:
 $ BUS_STOP_N: Factor w/ 5145 levels "01012","01013",..: 1008 1724 2118 4972 431 3787 1160 2941 1610 4983 ...
 $ BUS_ROOF_N: chr  "B06" "B23" "B01" "B05" ...
 $ LOC_DESC  : chr  "OPP CEVA LOGISTICS" "AFT TRACK 13" "BLK 239" "GRACE INDEPENDENT CH" ...
 $ geometry  :sfc_POINT of length 5161; first list element:  'XY' num  13576 32884
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA
  ..- attr(*, "names")= chr [1:3] "BUS_STOP_N" "BUS_ROOF_N" "LOC_DESC"

Based on the output above, BUS_STOP_N is now a factor of 5145 levels. However, there are a total of 5161 observations. As such, the duplicate records should be checked. The code chunk below shows a list of duplicated bus stop codes.

Code
duplicate <- BusStop %>%
  group_by(BUS_STOP_N) %>%
  filter(n() > 1) %>%
  arrange(BUS_STOP_N)

kable(duplicate)
BUS_STOP_N BUS_ROOF_N LOC_DESC geometry
11009 B04 Ghim Moh Ter POINT (23101.34 32594.17)
11009 B04-TMNL GHIM MOH TER POINT (23100.58 32604.36)
22501 B02 Blk 662A POINT (13489.09 35536.4)
22501 B02 BLK 662A POINT (13488.02 35537.88)
43709 B06 BLK 644 POINT (18963.42 36762.8)
43709 B06 BLK 644 POINT (18952.02 36751.83)
47201 UNK NA POINT (22616.75 47793.68)
47201 NIL W’LANDS NTH STN POINT (22632.92 47934)
51071 B21 MACRITCHIE RESERVOIR POINT (28311.27 36036.92)
51071 B21 MACRITCHIE RESERVOIR POINT (28282.54 36033.93)
52059 B03 OPP BLK 65 POINT (30770.3 34460.06)
52059 B09 BLK 219 POINT (30565.45 36133.15)
53041 B05 Upp Thomson Road POINT (28105.8 37246.76)
53041 B07 Upp Thomson Road POINT (27956.34 37379.29)
58031 UNK OPP CANBERRA DR POINT (27089.69 47570.9)
58031 UNK OPP CANBERRA DR POINT (27111.07 47517.77)
62251 B03 Bef Blk 471B POINT (35500.54 39943.41)
62251 B03 BEF BLK 471B POINT (35500.36 39943.34)
67421 B01 CHENG LIM STN EXIT B POINT (34548.54 42052.15)
67421 NIL CHENG LIM STN EXIT B POINT (34741.77 42004.21)
68091 B01 AFT BAKER ST POINT (32164.11 42695.98)
68091 B08 AFT BAKER ST POINT (32038.84 43298.68)
68099 B02 BEF BAKER ST POINT (32154.9 42742.82)
68099 B07 BEF BAKER ST POINT (32004.05 43320.34)
77329 B01 BEF PASIR RIS ST 53 POINT (40765.35 39452.18)
77329 B03 Pasir Ris Central POINT (40728.15 39438.15)
82221 B01 BLK 3A POINT (35323.6 33257.05)
82221 B01 Blk 3A POINT (35308.74 33335.17)
96319 NA Yusen Logistics POINT (42187.23 34995.78)
96319 NIL YUSEN LOGISTICS POINT (42187.23 34995.78)
97079 B14 OPP ST. JOHN’S CRES POINT (44144.57 38980.25)
97079 B14 OPP ST. JOHN’S CRES POINT (44055.75 38908.5)

The code chunk below uses the distinct() function to keep only the unique rows based on the BUS_STOP_N while preserving all other fields (.keep_all = TRUE). By default, distinct() keeps the first occurrence of each unique combination of values in the specified columns.

Code
BusStop <- BusStop %>%
  distinct(BUS_STOP_N, .keep_all = TRUE)

# Re-examine the structure of the data frame
str(BusStop)
Classes 'sf' and 'data.frame':  5145 obs. of  4 variables:
 $ BUS_STOP_N: Factor w/ 5145 levels "01012","01013",..: 1008 1724 2118 4972 431 3787 1160 2941 1610 4983 ...
 $ BUS_ROOF_N: chr  "B06" "B23" "B01" "B05" ...
 $ LOC_DESC  : chr  "OPP CEVA LOGISTICS" "AFT TRACK 13" "BLK 239" "GRACE INDEPENDENT CH" ...
 $ geometry  :sfc_POINT of length 5145; first list element:  'XY' num  13576 32884
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA
  ..- attr(*, "names")= chr [1:3] "BUS_STOP_N" "BUS_ROOF_N" "LOC_DESC"

There are now a total of 5145 observations, aligned with the number of factor levels.

Import Passenger Volume by Origin-Destination Bus Stops

The code chunk below uses the read_csv() function of readr package (imported with the tidyverse package) to import the csv files into R.

Code
odbus <- read_csv("data/aspatial/origin_destination_bus_202308.csv")

# Examine the structure of the data frame
str(odbus)
spc_tbl_ [5,709,512 × 7] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ YEAR_MONTH         : chr [1:5709512] "2023-08" "2023-08" "2023-08" "2023-08" ...
 $ DAY_TYPE           : chr [1:5709512] "WEEKDAY" "WEEKENDS/HOLIDAY" "WEEKENDS/HOLIDAY" "WEEKDAY" ...
 $ TIME_PER_HOUR      : num [1:5709512] 16 16 14 14 17 17 17 17 7 17 ...
 $ PT_TYPE            : chr [1:5709512] "BUS" "BUS" "BUS" "BUS" ...
 $ ORIGIN_PT_CODE     : chr [1:5709512] "04168" "04168" "80119" "80119" ...
 $ DESTINATION_PT_CODE: chr [1:5709512] "10051" "10051" "90079" "90079" ...
 $ TOTAL_TRIPS        : num [1:5709512] 7 2 3 10 5 4 3 22 3 3 ...
 - attr(*, "spec")=
  .. cols(
  ..   YEAR_MONTH = col_character(),
  ..   DAY_TYPE = col_character(),
  ..   TIME_PER_HOUR = col_double(),
  ..   PT_TYPE = col_character(),
  ..   ORIGIN_PT_CODE = col_character(),
  ..   DESTINATION_PT_CODE = col_character(),
  ..   TOTAL_TRIPS = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 

Based on the data frame structure seen above, ORIGIN_PT_CODE and DESTINATION_PT_CODE are listed as character variables. These variables are equivalent to BUS_STOP_N of BusStop sf data frame and should be transformed to factors so that R treats them as grouping variables.

Code
# Columns to convert to factors
columns_to_convert <- c("ORIGIN_PT_CODE", "DESTINATION_PT_CODE")

# Apply as.factor() to the adjusted columns
odbus[columns_to_convert] <- lapply(odbus[columns_to_convert], as.factor)

# Re-examine the structure of the data frame
str(odbus)
spc_tbl_ [5,709,512 × 7] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ YEAR_MONTH         : chr [1:5709512] "2023-08" "2023-08" "2023-08" "2023-08" ...
 $ DAY_TYPE           : chr [1:5709512] "WEEKDAY" "WEEKENDS/HOLIDAY" "WEEKENDS/HOLIDAY" "WEEKDAY" ...
 $ TIME_PER_HOUR      : num [1:5709512] 16 16 14 14 17 17 17 17 7 17 ...
 $ PT_TYPE            : chr [1:5709512] "BUS" "BUS" "BUS" "BUS" ...
 $ ORIGIN_PT_CODE     : Factor w/ 5067 levels "01012","01013",..: 104 104 4422 4422 2008 2008 832 832 779 355 ...
 $ DESTINATION_PT_CODE: Factor w/ 5071 levels "01012","01013",..: 239 239 4736 4736 691 691 807 807 234 107 ...
 $ TOTAL_TRIPS        : num [1:5709512] 7 2 3 10 5 4 3 22 3 3 ...
 - attr(*, "spec")=
  .. cols(
  ..   YEAR_MONTH = col_character(),
  ..   DAY_TYPE = col_character(),
  ..   TIME_PER_HOUR = col_double(),
  ..   PT_TYPE = col_character(),
  ..   ORIGIN_PT_CODE = col_character(),
  ..   DESTINATION_PT_CODE = col_character(),
  ..   TOTAL_TRIPS = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 

Based on the output above, ORIGIN_PT_CODE and DESTINATION_PT_CODE are now factors.

Extract Commuting Flow data

The code chunk below extracts commuting flows for the four target time periods. For completeness, a list of all possible Bus Stops x Day Time x Time combinations is created. Following which, the commuting flow for each Bus Stops x Day Time x Time combination within the odbus data frame is computed and missing combinations are imputed with a value of 0. Lastly the resultant data frame is pivoted wider to form a data frame where each row represents one bus stop code, similar to the BusStop sf data frame.

Code
# Create a list of possible bus stop code, day type and time per hour for checking
BusStopList <- expand.grid(ORIGIN_PT_CODE = unique(BusStop$BUS_STOP_N),
                          DAY_TYPE = c("WEEKDAY", "WEEKENDS/HOLIDAY"),
                          TIME_PER_HOUR = unique(odbus$TIME_PER_HOUR))

# Commute commuting flow by bus stop code, day type and time per hour
odbus_recoded <- odbus %>%
  group_by(ORIGIN_PT_CODE, DAY_TYPE, TIME_PER_HOUR) %>%
  summarise(TRIPS = sum(TOTAL_TRIPS)) 

# Identify missing Bus Stops and impute a 0 value
odbus_missing <- BusStopList %>%
  anti_join(odbus_recoded, by = c("ORIGIN_PT_CODE", "DAY_TYPE", "TIME_PER_HOUR")) %>%
  mutate(TRIPS = 0)

# Combine the above two data frames
odbus_long <- rbind(odbus_recoded, odbus_missing) %>%
    mutate(peak_period = case_when(
    DAY_TYPE == "WEEKDAY" & TIME_PER_HOUR >= 6 & TIME_PER_HOUR <=9 ~ 
      "weekday_morn",
    DAY_TYPE == "WEEKDAY" & TIME_PER_HOUR >= 17 & TIME_PER_HOUR <=20 ~ 
      "weekday_evening",
    DAY_TYPE == "WEEKENDS/HOLIDAY" & TIME_PER_HOUR >= 11 & TIME_PER_HOUR <=14 ~ 
      "weekend_morn",
    DAY_TYPE == "WEEKENDS/HOLIDAY" & TIME_PER_HOUR >= 16 & TIME_PER_HOUR <=19 ~ 
      "weekend_evening",
    TRUE ~ "non_peak")) 

# Pivot wider to form data frame with one bus stop code per row
odbus_wide_peak <- odbus_long %>%
  group_by(ORIGIN_PT_CODE, peak_period) %>%
  summarise(TOTAL_TRIPS = sum(TRIPS)) %>%
  pivot_wider(names_from = peak_period,
              values_from = TOTAL_TRIPS)

odbus_wide <- odbus_long %>%
  filter(!(peak_period == "non_peak")) %>%
  group_by(ORIGIN_PT_CODE, DAY_TYPE, TIME_PER_HOUR) %>%
  summarise(TOTAL_TRIPS = sum(TRIPS)) %>%
  mutate(Day_Time = paste(DAY_TYPE, TIME_PER_HOUR)) %>%
  ungroup %>%
  select(-c(DAY_TYPE, TIME_PER_HOUR)) %>%
  pivot_wider(names_from = Day_Time,
              values_from = TOTAL_TRIPS) %>%
  left_join(odbus_wide_peak)

Append Commuting Flow data with Bus Stop Locations

left_join() of dplyr is then used to join the geographical data and commuting flow data table using the Bus Stop Code as the common identifier. Left join is done to ensure that the geospatial properties (geometry column) of the BusStop sf data frame is retained and bus stops not found within the geospatial data set are dropped.

Code
odBusStop <- left_join(BusStop, odbus_wide, by = c("BUS_STOP_N" = "ORIGIN_PT_CODE"))

str(odBusStop)
Classes 'sf' and 'data.frame':  5145 obs. of  25 variables:
 $ BUS_STOP_N         : Factor w/ 5199 levels "01012","01013",..: 1008 1724 2118 4972 431 3787 1160 2941 1610 4983 ...
 $ BUS_ROOF_N         : chr  "B06" "B23" "B01" "B05" ...
 $ LOC_DESC           : chr  "OPP CEVA LOGISTICS" "AFT TRACK 13" "BLK 239" "GRACE INDEPENDENT CH" ...
 $ WEEKDAY 6          : num  0 0 497 67 46 ...
 $ WEEKDAY 7          : num  9 0 623 128 36 ...
 $ WEEKDAY 8          : num  1 0 631 62 60 ...
 $ WEEKDAY 9          : num  5 0 373 50 48 ...
 $ WEEKDAY 17         : num  19 0 654 254 47 ...
 $ WEEKDAY 18         : num  17 0 485 104 60 ...
 $ WEEKDAY 19         : num  6 0 382 33 58 70 149 284 841 221 ...
 $ WEEKDAY 20         : num  1 0 221 8 30 42 18 140 610 91 ...
 $ WEEKENDS/HOLIDAY 11: num  0 0 178 27 12 57 3 415 502 234 ...
 $ WEEKENDS/HOLIDAY 12: num  2 0 201 39 32 44 3 447 551 246 ...
 $ WEEKENDS/HOLIDAY 13: num  1 0 183 17 24 27 4 320 560 206 ...
 $ WEEKENDS/HOLIDAY 14: num  3 0 138 27 13 30 0 254 357 172 ...
 $ WEEKENDS/HOLIDAY 16: num  6 1 130 21 21 34 17 131 465 232 ...
 $ WEEKENDS/HOLIDAY 17: num  4 0 151 23 10 41 37 133 541 171 ...
 $ WEEKENDS/HOLIDAY 18: num  1 0 135 29 11 57 15 142 499 167 ...
 $ WEEKENDS/HOLIDAY 19: num  1 0 130 11 19 41 15 91 368 140 ...
 $ non_peak           : num  163 6 3813 935 412 ...
 $ weekday_evening    : num  43 0 1742 399 195 ...
 $ weekday_morn       : num  15 0 2124 307 190 ...
 $ weekend_evening    : num  12 1 546 84 61 ...
 $ weekend_morn       : num  6 0 700 110 81 ...
 $ geometry           :sfc_POINT of length 5145; first list element:  'XY' num  13576 32884
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
  ..- attr(*, "names")= chr [1:24] "BUS_STOP_N" "BUS_ROOF_N" "LOC_DESC" "WEEKDAY 6" ...

There are a total of 5145 observations, which matches the number of features in the BusStop sf data frame.

Geovisualisation and Analysis

Create Spatial Grids

Spatial grids are commonly used in spatial analysis to divide the study area into equal-sized, regular polygons that can tessellate the whole study area. After that, a variable is summarized within each polygon. For this exercise, a hexagon layer of 250m (where 250m is the perpendicular distance between the hexagon centre and its edges) will be created.

In the code chunk below, st_make_grid is used to create the hexagon grids using the svy21 projected coordinate system. In st_make_grid, cellsize argument refers to the cell size in the units that the crs of the spatial data is using. The cellsize can be defined as the distance between opposite edges. Since the data is set in SVY21 projected coordinate system, which uses metres as the unit, the value is set as c(500,500) to create a hexagon layer of 250m. The resulting data frame is then converted into a sf data frame and an index is added for each hexagon grid.

Code
hex_grid <- st_make_grid(odBusStop, cellsize = c(500, 500), 
                         crs = 3413, what = "polygons", square = FALSE) %>%
  st_sf() %>%
  mutate(index = row_number())

In the code chunk below, st_join and st_intersects() is used to return a list of bus stops lying within each hexagon grid. A left join is done in this process (using the argument left = TRUE) to ensure that the geospatial properties (geometry column) of the hex_grid sf data frame is retained. After filtering out hexagon grids that do not contain any bus stops, the number of bus stops and the total number of trips at each time period is then computed for each hexagon grid.

Code
hex_grid_count <- st_join(hex_grid, odBusStop, join = st_intersects, left = TRUE) %>%
  filter(!(is.na(BUS_STOP_N))) %>%
  group_by(index) %>%
  summarise(
    count = n(),
    bus_stop_codes = paste(BUS_STOP_N, collapse = ", "),
    bus_stop_names = paste(LOC_DESC, collapse = ", "),
    across(starts_with("week"), sum)
  ) %>%
  ungroup()

The hex_grid_count sf data frame can now be used for plotting the geographical distribution of peak hour bus stops and commuting flow using tmap.

Before moving on to the data visualisation, save hex_grid and hex_grid_count sf data frames into rds file format.

Code
write_rds(hex_grid, "data/rds/hex_grid.rds")
write_rds(hex_grid_count, "data/rds/hex_grid_count.rds")

Visualisation of Geographical Distribution of Commuting Flow

Continuous Scale

For efficiency of plotting, select just the columns that will be used for this section.

Code
hex_grid_count2 <- hex_grid_count %>%
  select(index, count, bus_stop_codes, bus_stop_names, geometry,
         weekday_evening, weekday_morn, weekend_evening, weekend_morn)

The following code chunks will review the geographical distribution of the weekdays and weekends peak hour commuting flows using a continuous scale fill.

The geographical distribution of the weekdays peak hour commuting flows is plotted side by side using tm_facets() function. The free.scales argument is set to FALSE to show a consistent scale across both time periods.

Code
tmap_mode("view")

tm_shape(hex_grid_count2) +
  tm_fill(col = c("weekday_morn", "weekday_evening"),
          palette = "Blues",
          n = 6, style = "jenks", 
          popup.vars = c("Bus Stops" = "bus_stop_codes", 
                         "Weekday Morning" = "weekday_morn", 
                         "Weekday Evening" = "weekday_evening")) + 
  tm_layout(title = c("Weekday Morning", "Weekday Evening")) +
  tm_facets(ncol = 1, free.scales = FALSE)
Takeaway

The visualisation above shows a heavily right-skewed distribution across all the weekday peak hour time periods. Most bus stops have less than 100k commuter trips at each of these time periods. Notable hexagon grids with significantly more commuting flow >300k are:

  • Weekday Mornings: 1251, 2411
  • Weekday Evenings: 1251, 2411, 3239, 4349, 4539

The code chunk below displays the hexagon grids with significantly more commuting flow >300k in greater detail.

Code
weekday_300k <- hex_grid_count2 %>%
  filter(weekday_morn >= 300000 | weekday_evening >= 300000) %>%
  select(index, bus_stop_names, weekday_morn, weekday_evening)

kable(weekday_300k)
index bus_stop_names weekday_morn weekday_evening geometry
1251 BLK 664C, Blk 662A, BLK 662D, BOON LAY MRT STN, BOON LAY INT, OPP BLK 662C, OPP BOON LAY MRT STN 396701 551636 POLYGON ((13720.12 35286.69…
2411 BLK 515, BEF W’LANDS STN EXIT 7, WOODLANDS REG INT, W’LANDS CIVIC CTR, BLK 894C, W’LANDS STN EXIT 4, W’LANDS STN EXIT 5, OPP W’LANDS CIVIC CTR 313192 474311 POLYGON ((22970.12 46112.01…
3239 BLK 322, BLK 422, AFT ANG MO KIO INT, AFT ANG MO KIO STN EXIT A, ANG MO KIO INT, ANG MO KIO STN 212340 325858 POLYGON ((29720.12 38750.79…
4349 BEDOK STN EXIT B, PANASONIC, OPP PANASONIC, BEDOK BUS INT, BEDOK STN EXIT A 187052 370328 POLYGON ((38720.12 33554.64…
4539 TAMPINES INT, TAMPINES INT, UOB TAMPINES CTR, OPP CENTURY SQ, OPP OUR TAMPINES HUB, OUR TAMPINES HUB, Aft Tampines Stn Exit E, Tampines Stn Exit D 202366 446207 POLYGON ((40220.12 37018.74…

The geographical distribution of the weekends / holidays peak hour commuting flows is plotted side by side using tm_facets() function. The free.scales argument is set to FALSE to show a consistent scale across both time periods.

Code
tmap_mode("view")

tm_shape(hex_grid_count2) +
  tm_fill(col = c("weekend_morn", "weekend_evening"),
          palette = "Purples",
          n = 6, style = "jenks", 
          popup.vars = c("Bus Stops" = "bus_stop_codes", 
                         "Weekend/PH Morning" = "weekend_morn", 
                         "Weekend/PH Evening" = "weekend_evening")) + 
  tm_layout(title = c("Weekend/PH Morning", "Weekend/PH Evening")) +
  tm_facets(ncol = 1, free.scales = FALSE)
Takeaway

Similarly, the visualisation above shows a heavily right-skewed distribution across all the weekend peak hour time periods. However, the continuous scale suggests that weekend peak hour commuting flow volume is approximately one-fifth of weekday peak hour commuting flow volume. Most bus stops have less than 20k commuter trips at each of these time periods. Notable hexagon grids with significantly more commuting flow >80k are:

  • Weekday Mornings: 1251, 2411, 4349, 4539
  • Weekday Evenings: 1251, 2411, 3239, 3578, 4349, 4539

The code chunk below displays the hexagon grids with significantly more commuting flow >80k in greater detail.

Code
weekend_80k <- hex_grid_count2 %>%
  filter(weekend_morn >= 80000 | weekend_evening >= 80000) %>%
  select(index, bus_stop_names, weekend_morn, weekend_evening)

kable(weekend_80k)
index bus_stop_names weekend_morn weekend_evening geometry
1251 BLK 664C, Blk 662A, BLK 662D, BOON LAY MRT STN, BOON LAY INT, OPP BLK 662C, OPP BOON LAY MRT STN 108551 149047 POLYGON ((13720.12 35286.69…
2411 BLK 515, BEF W’LANDS STN EXIT 7, WOODLANDS REG INT, W’LANDS CIVIC CTR, BLK 894C, W’LANDS STN EXIT 4, W’LANDS STN EXIT 5, OPP W’LANDS CIVIC CTR 99396 123274 POLYGON ((22970.12 46112.01…
3239 BLK 322, BLK 422, AFT ANG MO KIO INT, AFT ANG MO KIO STN EXIT A, ANG MO KIO INT, ANG MO KIO STN 79535 94824 POLYGON ((29720.12 38750.79…
3578 BLK 421, SERANGOON INT, S’GOON STN EXIT A / BLK 413, S’GOON STN EXIT C / BLK 201, S’GOON STN EXIT E, S’GOON STN EXIT H, BLK 409, S’GOON STN EXIT D / BLK 416 57084 80507 POLYGON ((32470.12 36585.73…
4349 BEDOK STN EXIT B, PANASONIC, OPP PANASONIC, BEDOK BUS INT, BEDOK STN EXIT A 101960 118967 POLYGON ((38720.12 33554.64…
4539 TAMPINES INT, TAMPINES INT, UOB TAMPINES CTR, OPP CENTURY SQ, OPP OUR TAMPINES HUB, OUR TAMPINES HUB, Aft Tampines Stn Exit E, Tampines Stn Exit D 95639 124569 POLYGON ((40220.12 37018.74…

However, due to the heavily right-skewed distribution, not much can be glimpsed from the current geovisualisation using continuous scale fill. A data classification method will be used to replot the maps.

Jenks Natural Breaks

The Jenks optimization method, also called the Jenks natural breaks classification method, is a data clustering method designed to determine the best arrangement of values into different classes by seeking to reduce the variance within classes and maximize the variance between classes. For this exercise, the number of breaks is defined at n=6.

The following code chunks will review the geographical distribution of the weekdays and weekends peak hour commuting flows using a continuous scale fill.

The geographical distribution of the weekdays peak hour commuting flows is plotted side by side using tm_facets() function. The free.scales argument is set to FALSE to show a consistent scale across both time periods.

Code
tmap_mode("view")

tm_shape(hex_grid_count2) +
  tm_fill(col = c("weekday_morn", "weekday_evening"),
          palette = "Blues",
          n = 6, style = "jenks", 
          popup.vars = c("Bus Stops" = "bus_stop_codes", 
                         "Weekday Morning" = "weekday_morn", 
                         "Weekday Evening" = "weekday_evening")) + 
  tm_layout(title = c("Weekday Morning", "Weekday Evening")) +
  tm_facets(ncol = 1, free.scales = FALSE)
Takeaway

Using Jenks Natural Breaks to classify the weekday peak hour commuting flow reveals more potential hot spots in the housing estates. This will be verified in the second part of this study on Emerging Hot Spots Analysis.

The code chunk below displays the hexagon grids that fall into the last two classes (>156k) in greater detail.

Code
weekday_156k <- hex_grid_count2 %>%
  filter(weekday_morn >= 156000 | weekday_evening >= 156000) %>%
  select(index, bus_stop_names, weekday_morn, weekday_evening)

kable(weekday_156k)
index bus_stop_names weekday_morn weekday_evening geometry
1251 BLK 664C, Blk 662A, BLK 662D, BOON LAY MRT STN, BOON LAY INT, OPP BLK 662C, OPP BOON LAY MRT STN 396701 551636 POLYGON ((13720.12 35286.69…
1438 LAKESIDE STN, OPP BLK 203, BLK 517, OPP LAKESIDE STN 94194 172864 POLYGON ((15220.12 36152.72…
1815 CHOA CHU KANG INT, LOT 1 / CHOA CHU KANG STN, OPP NORTHVALE CONDO, OPP CHOA CHU KANG STN 156028 289879 POLYGON ((18220.12 40482.84…
1904 BLK 644, BLK 231, BLK 503, BLK 225, BT BATOK INT, BLK 240 137323 164006 POLYGON ((18970.12 36585.73…
2054 BLK 431, CLEMENTI STN, CLEMENTI STN 202467 184460 POLYGON ((20220.12 32688.62…
2062 BT PANJANG STN EXIT A/ LRT, OPP BT PANJANG STN, AFT BT PANJANG STN, BT PANJANG INT, THE LINEAR, BT PANJANG STN EXIT B, OPP BLK 180, BLK 180 95420 230072 POLYGON ((20220.12 39616.82…
2411 BLK 515, BEF W’LANDS STN EXIT 7, WOODLANDS REG INT, W’LANDS CIVIC CTR, BLK 894C, W’LANDS STN EXIT 4, W’LANDS STN EXIT 5, OPP W’LANDS CIVIC CTR 313192 474311 POLYGON ((22970.12 46112.01…
3204 BLK 84B, OPP TOA PAYOH STN, BLK 138B, BLK 79C, TOA PAYOH BUS INT, TOA PAYOH BUS INT, TOA PAYOH STN 166638 248170 POLYGON ((29470.12 34853.68…
3239 BLK 322, BLK 422, AFT ANG MO KIO INT, AFT ANG MO KIO STN EXIT A, ANG MO KIO INT, ANG MO KIO STN 212340 325858 POLYGON ((29720.12 38750.79…
3578 BLK 421, SERANGOON INT, S’GOON STN EXIT A / BLK 413, S’GOON STN EXIT C / BLK 201, S’GOON STN EXIT E, S’GOON STN EXIT H, BLK 409, S’GOON STN EXIT D / BLK 416 111428 299786 POLYGON ((32470.12 36585.73…
3859 NAUNG RESIDENCE, THE MIDTOWN, HOUGANG CTRL INT, BLK 302, HOUGANG CTRL INT, BLK 327 106496 183771 POLYGON ((34720.12 38750.79…
3924 BLK 248A, SENGKANG INT, COMPASSVALE INT, BLK 240, Sengkang Stn, BLK 259C, BLK 250A 77090 211180 POLYGON ((35220.12 41348.87…
3957 PUNGGOL STN / WATERWAY PT, PUNGGOL BUS INTERCHANGE, BEF BLK 264, OPP BLK 272C, BLK 272C 88871 179509 POLYGON ((35470.12 42647.91…
4018 BLK 102C, BEF PUNGGOL RD, AFT PUNGGOL RD, OPP BLK 103A 207583 146562 POLYGON ((35970.12 41781.88…
4349 BEDOK STN EXIT B, PANASONIC, OPP PANASONIC, BEDOK BUS INT, BEDOK STN EXIT A 187052 370328 POLYGON ((38720.12 33554.64…
4539 TAMPINES INT, TAMPINES INT, UOB TAMPINES CTR, OPP CENTURY SQ, OPP OUR TAMPINES HUB, OUR TAMPINES HUB, Aft Tampines Stn Exit E, Tampines Stn Exit D 202366 446207 POLYGON ((40220.12 37018.74…
4635 OPP PASIR RIS STN EXIT A, PASIR RIS INT, PASIR RIS INT, AFT PASIR RIS ST 53, AFT PASIR RIS DR 3, BLK 586, BEF PASIR RIS ST 53, AFT PASIR RIS STN EXIT A, BEF PASIR RIS STN 88685 220510 POLYGON ((40970.12 39183.81…

All the hexagon grids are in housing estates.

The geographical distribution of the weekends / holidays peak hour commuting flows is plotted side by side using tm_facets() function. The free.scales argument is set to FALSE to show a consistent scale across both time periods.

Code
tmap_mode("view")

tm_shape(hex_grid_count2) +
  tm_fill(col = c("weekend_morn", "weekend_evening"),
          palette = "Purples",
          n = 6, style = "jenks", 
          popup.vars = c("Bus Stops" = "bus_stop_codes", 
                         "Weekend/PH Morning" = "weekend_morn", 
                         "Weekend/PH Evening" = "weekend_evening")) + 
  tm_layout(title = c("Weekend/PH Morning", "Weekend/PH Evening")) +
  tm_facets(ncol = 1, free.scales = FALSE)
Takeaway

Similarly, using Jenks Natural Breaks to classify the weekend peak hour commuting flow reveals more potential hot spots not just in the major housing estates, but also in the town area and the area surrounding Woodlands Checkpoint. This will be verified in the second part of this study on Emerging Hot Spots Analysis.

The code chunk below displays the hexagon grids that fall into the last two classes (>31k) in greater detail.

Code
weekend_31k <- hex_grid_count2 %>%
  filter(weekend_morn >= 31000 | weekend_evening >= 31000) %>%
  select(index, bus_stop_names, weekend_morn, weekend_evening)

kable(weekend_31k)
index bus_stop_names weekend_morn weekend_evening geometry
1251 BLK 664C, Blk 662A, BLK 662D, BOON LAY MRT STN, BOON LAY INT, OPP BLK 662C, OPP BOON LAY MRT STN 108551 149047 POLYGON ((13720.12 35286.69…
1438 LAKESIDE STN, OPP BLK 203, BLK 517, OPP LAKESIDE STN 23964 37003 POLYGON ((15220.12 36152.72…
1778 J GATEWAY, JURONG EAST TEMP INT, OPP JEM 25234 37976 POLYGON ((17970.12 34853.68…
1815 CHOA CHU KANG INT, LOT 1 / CHOA CHU KANG STN, OPP NORTHVALE CONDO, OPP CHOA CHU KANG STN 62447 78360 POLYGON ((18220.12 40482.84…
1904 BLK 644, BLK 231, BLK 503, BLK 225, BT BATOK INT, BLK 240 39157 43066 POLYGON ((18970.12 36585.73…
2024 OPP REGENT PK, NAN HUA PR SCH, CLEMENTI INT, REGENT PK 37828 40308 POLYGON ((19970.12 33121.63…
2054 BLK 431, CLEMENTI STN, CLEMENTI STN 44441 52125 POLYGON ((20220.12 32688.62…
2062 BT PANJANG STN EXIT A/ LRT, OPP BT PANJANG STN, AFT BT PANJANG STN, BT PANJANG INT, THE LINEAR, BT PANJANG STN EXIT B, OPP BLK 180, BLK 180 48824 64095 POLYGON ((20220.12 39616.82…
2102 W’LANDS CHECKPT 42251 38354 POLYGON ((20470.12 46978.04…
2133 W’LANDS CHECKPT 28240 38286 POLYGON ((20720.12 47411.05…
2135 JOHOR BAHRU CHECKPT, JOHOR BAHRU CHECKPT 31773 38431 POLYGON ((20720.12 49143.1,…
2411 BLK 515, BEF W’LANDS STN EXIT 7, WOODLANDS REG INT, W’LANDS CIVIC CTR, BLK 894C, W’LANDS STN EXIT 4, W’LANDS STN EXIT 5, OPP W’LANDS CIVIC CTR 99396 123274 POLYGON ((22970.12 46112.01…
2854 HARBOURFRONT STN EXIT A, OPP VIVOCITY, HARBOURFRONT/VIVOCITY 24091 41675 POLYGON ((26720.12 27492.46…
2950 OPP TIONG BAHRU STN, TIONG BAHRU STN, CTRL GREEN CONDO 37980 35154 POLYGON ((27470.12 29657.53…
3014 OPP NGEE ANN CITY, BEF ORCHARD STN EXIT B, ORCHARD STN / LUCKY PLAZA, ORCHARD STN / TANGS PLAZA, OPP ORCHARD STN / ION 35047 54816 POLYGON ((27970.12 31389.58…
3166 OPP HONG LIM CPLX, CHINATOWN STN EXIT E, MAXWELL RD FC, NEW BRIDGE CTR, OPP SRI MARIAMMAN TP, CHINATOWN STN EXIT C 43599 45553 POLYGON ((29220.12 29224.51…
3204 BLK 84B, OPP TOA PAYOH STN, BLK 138B, BLK 79C, TOA PAYOH BUS INT, TOA PAYOH BUS INT, TOA PAYOH STN 71600 78369 POLYGON ((29470.12 34853.68…
3239 BLK 322, BLK 422, AFT ANG MO KIO INT, AFT ANG MO KIO STN EXIT A, ANG MO KIO INT, ANG MO KIO STN 79535 94824 POLYGON ((29720.12 38750.79…
3292 ST JOSEPH’S CH, AFT BUGIS STN EXIT C, OPP BUGIS STN EXIT C, BRAS BASAH CPLX, HOTEL GRAND PACIFIC, BEF BENCOOLEN STN EXIT A, AFT WATERLOO ST, BEF WATERLOO ST 29286 38048 POLYGON ((30220.12 30956.57…
3578 BLK 421, SERANGOON INT, S’GOON STN EXIT A / BLK 413, S’GOON STN EXIT C / BLK 201, S’GOON STN EXIT E, S’GOON STN EXIT H, BLK 409, S’GOON STN EXIT D / BLK 416 57084 80507 POLYGON ((32470.12 36585.73…
3822 SIMS VILLE, BLK 1015, PAYA LEBAR STN EXIT D, OPP BLK 1015, PAYA LEBAR STN EXIT C, PAYA LEBAR STN EXIT B 28803 34103 POLYGON ((34470.12 33121.63…
3859 NAUNG RESIDENCE, THE MIDTOWN, HOUGANG CTRL INT, BLK 302, HOUGANG CTRL INT, BLK 327 46128 50760 POLYGON ((34720.12 38750.79…
3924 BLK 248A, SENGKANG INT, COMPASSVALE INT, BLK 240, Sengkang Stn, BLK 259C, BLK 250A 44740 53323 POLYGON ((35220.12 41348.87…
3957 PUNGGOL STN / WATERWAY PT, PUNGGOL BUS INTERCHANGE, BEF BLK 264, OPP BLK 272C, BLK 272C 41764 49806 POLYGON ((35470.12 42647.91…
4006 OPP PARKWAY PARADE, PARKWAY PARADE, OPP 112 KATONG 31559 37707 POLYGON ((35970.12 31389.58…
4018 BLK 102C, BEF PUNGGOL RD, AFT PUNGGOL RD, OPP BLK 103A 39349 41052 POLYGON ((35970.12 41781.88…
4349 BEDOK STN EXIT B, PANASONIC, OPP PANASONIC, BEDOK BUS INT, BEDOK STN EXIT A 101960 118967 POLYGON ((38720.12 33554.64…
4539 TAMPINES INT, TAMPINES INT, UOB TAMPINES CTR, OPP CENTURY SQ, OPP OUR TAMPINES HUB, OUR TAMPINES HUB, Aft Tampines Stn Exit E, Tampines Stn Exit D 95639 124569 POLYGON ((40220.12 37018.74…
4635 OPP PASIR RIS STN EXIT A, PASIR RIS INT, PASIR RIS INT, AFT PASIR RIS ST 53, AFT PASIR RIS DR 3, BLK 586, BEF PASIR RIS ST 53, AFT PASIR RIS STN EXIT A, BEF PASIR RIS STN 40921 54305 POLYGON ((40970.12 39183.81…

All the hexagon grids are in housing estates.